Poster: LAJ Date: Mar 21, 2006 6:21pm
Forum: netlabels Subject: Reviews ? [odd character being displayed]

I posted four reviews this evening and noticed some rather odd characters showing up in some of them. This one for instance: http://www.archive.org/details/top.19

If someone could let me know what I'm doing wrong (or not doing), I'd appreciate it.

Thanks




This post was modified by LAJ on 2006-03-21 22:24:46

This post was modified by LAJ on 2006-03-22 02:21:06


Poster: jm / i.standard (Administrator, Curator, or Staff) Date: Mar 22, 2006 7:23pm
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

I'm quite sure I've already asked this question here, but how come some reviews are not displayed in the "most recent reviews"?

I wrote a review for this item, and it doesn't seem to show up:

http://www.archive.org/details/top.38

Anyone got a clue?


Poster: LAJ Date: Mar 23, 2006 7:26am
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

jm -

Did you see Transient's response to my post?
http://www.archive.org/iathreads/post-view.php?id=58226


Poster: billmoyer Date: Mar 22, 2006 1:39am
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

Hello, LAJ!

My name is Bill Moyer, and I am an engineer at The Archive. Those odd characters are Latin-1 encodings of "smart quotes", as described here:
http://www.stone.com/The_Cocoa_Files/Smart_Quotes.html

The baseline computer character set (ASCII) only has three kinds of quotation-like symbols:
single-quote ['] (which doubles as an apostrophe),
double-quote ["], and
backtick [`].

Over the years, the baseline set has been expanded in different (incompatible) ways by different standards organizations, so that symbols like "smart quotes" can be represented. For (lame) reasons I won't get into right now, The Archive decided to standardize its website code on the UTF8 character set, while most browsers and word processors generate web documents based on the Latin-1 character set.

What this means is that when a user views a review, our servers tell their browser "Expect UTF8 character encodings in this document", and when that document contains Latin-1 encoded characters, the browser doesn't know what to do, and does some weird implementation-dependent thing. For instance, my browser here at home shows me a slanted-A character followed by a dotted outline of a box. One of my browsers at work shows me two small boxes with numbers in them.
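To make the mismatch concrete, here is a minimal Python sketch. It assumes the review text arrived as Windows-1252 bytes (the encoding commonly lumped together with Latin-1, and the one that actually places smart quotes at bytes 0x91 to 0x94); the sample sentence is made up for illustration. A browser told to expect UTF8 cannot interpret those lone bytes, so it substitutes some placeholder glyph.

```python
# Sample text with a curly apostrophe and curly double quotes (made-up example).
text = "It\u2019s \u201codd\u201d"

# Assumption: the word processor or browser sent the text as Windows-1252 bytes.
raw = text.encode("cp1252")          # b"It\x92s \x93odd\x94"

# The server says "expect UTF-8". The lone bytes 0x92, 0x93, 0x94 are not valid
# UTF-8, so a lenient decoder swaps each one for U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", errors="replace")

# The same text encoded as UTF-8 uses three bytes per curly quote and
# round-trips cleanly.
good = text.encode("utf-8")

print(repr(raw))
print(repr(as_utf8))
print(good.decode("utf-8") == text)
```

What a particular browser draws in place of the bad bytes (boxes, accented letters, question marks) depends on the browser, which matches the different results described above.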

I am making an effort to get our website code to do the right thing, but in the meantime we only have workarounds.

I have software which I can run on our servers which sweeps through users' reviews and item descriptions, finds all Latin-1 encoded characters, and converts them to UTF8 encoded characters. That's one workaround.
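The actual conversion software is not shown in this thread. As a rough sketch of the idea only, the hypothetical helper `to_utf8` below leaves bytes alone when they are already valid UTF8 and otherwise assumes they are Windows-1252 and re-encodes them. Both the function name and the fallback encoding are assumptions for illustration.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return `raw` as UTF-8 bytes (hypothetical helper, not The Archive's tool)."""
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8, leave untouched
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        # Bytes undefined in that code page become U+FFFD instead of raising.
        text = raw.decode("cp1252", errors="replace")
        return text.encode("utf-8")


if __name__ == "__main__":
    legacy = b"It\x92s fine"             # 0x92 is a curly apostrophe in Windows-1252
    print(to_utf8(legacy).decode("utf-8"))
```

One limitation of this sketch: a review that mixes valid UTF8 sequences with stray Windows-1252 bytes is treated as Windows-1252 throughout, which would mangle the parts that were already correct. A real sweep would probably need to work on smaller pieces of the text.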

You might be able to work around this in your browser. Some browsers have a configuration setting for generating UTF8 characters. This will not make that review look any different, but if you edited the review and replaced the funny characters with UTF8 quotation marks / apostrophes then all would be right with the world (until someone else posted a review using Latin-1).

The simple solution is to just use ["'`], but that's "so 1980's" to most people, and some browsers will automatically detect paired "'s and "helpfully" convert them to Smart Quotes for you.
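For anyone who prefers the plain-quotes route, a small Python sketch of replacing the four common curly quotes with their ASCII equivalents before pasting a review. The name `plain_quotes` is made up for this example, and the table covers only these four characters.

```python
# Map the four common "smart quote" characters to plain ASCII quotes.
ASCII_QUOTES = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote / apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def plain_quotes(text: str) -> str:
    """Return `text` with curly quotes replaced by ASCII quotes."""
    return text.translate(ASCII_QUOTES)


if __name__ == "__main__":
    print(plain_quotes("It\u2019s \u201codd\u201d"))
```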

Let me know (ttk@archive.org) if you want me to run my de-latinizer software on your reviews (I will be running it on all of The Archive's reviews in a week or two, I hope). Otherwise I leave it up to you. Eventually I hope we will have a permanent solution in place, but I do not know how long that might take -- the engineers responsible for the website are in a different department from mine.

Sorry for the inconvenience,
-- Bill


Poster: LAJ Date: Mar 22, 2006 5:29am
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

Thank you very much - I appreciate your help.

LAJ


Poster: dasboogie (Administrator, Curator, or Staff) Date: Mar 21, 2006 7:11pm
Forum: netlabels Subject: Re: Reviews ? [odd character being displayed]

Probably you are using the wrong character set.